Contents
  1. 1. JOIN 语法
  2. 2. JOIN算法
    1. 2.1. simple nested loop join
    2. 2.2. index nested loop join
    3. 2.3. block nested loop join

JOIN 语法

mysql> select * from t1;
+---+------+
| a | b    |
+---+------+
| 1 | 1    |
| 2 | 11   |
| 3 | 12   |
| 5 | 50   |
+---+------+
4 rows in set (0.01 sec)

mysql> 

mysql> select * from t2;
+---+------+
| a | b    |
+---+------+
| 2 | 2    |
+---+------+
1 rows in set (0.01 sec)

mysql> 

语法一

mysql> select * from t1,t2 where t1.a=t2.a;
+---+------+------+------+
| a | b    | a(2) | b(2) |
+---+------+------+------+
| 2 | 11   | 2    | 2    |
+---+------+------+------+
1 rows in set (0.02 sec)

mysql> explain select * from t1,t2 where t1.a=t2.a;
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref       | rows | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-------------+
| 1  | SIMPLE      | t2    | index  | PRIMARY       | b       | 123     | NULL      | 1    | Using index |
| 1  | SIMPLE      | t1    | eq_ref | PRIMARY       | PRIMARY | 4       | test.t2.a | 1    | NULL        |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-------------+
2 rows in set (0.06 sec)

mysql> 

语法二

mysql> select * from t1 inner join t2 on t1.a=t2.a;
+---+------+------+------+
| a | b    | a(2) | b(2) |
+---+------+------+------+
| 2 | 11   | 2    | 2    |
+---+------+------+------+
1 rows in set (0.02 sec)

mysql> 
mysql> explain select * from t1 inner join t2 on t1.a=t2.a;
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref       | rows | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-------------+
| 1  | SIMPLE      | t2    | index  | PRIMARY       | b       | 123     | NULL      | 1    | Using index |
| 1  | SIMPLE      | t1    | eq_ref | PRIMARY       | PRIMARY | 4       | test.t2.a | 1    | NULL        |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-------------+
2 rows in set (0.01 sec)

语法三

mysql> select * from t1 join t2 on t1.a=t2.a;
+---+------+------+------+
| a | b    | a(2) | b(2) |
+---+------+------+------+
| 2 | 11   | 2    | 2    |
+---+------+------+------+
1 rows in set (0.04 sec)

mysql> explain select * from t1 join t2 on t1.a=t2.a;
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref       | rows | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-------------+
| 1  | SIMPLE      | t2    | index  | PRIMARY       | b       | 123     | NULL      | 1    | Using index |
| 1  | SIMPLE      | t1    | eq_ref | PRIMARY       | PRIMARY | 4       | test.t2.a | 1    | NULL        |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+-------------+
2 rows in set (0.02 sec)

mysql> 

通过上述的EXPLAIN可以得知,三种JOIN语法在执行性能和效果上都是一样的。

JOIN算法

  • nsted_loop join
    1. simple nested-loop join
    2. index nested-loop join
    3. block nested-loop join

simple nested loop join

simple nested_loog join算法可以理解成两个for循环,外循环走一次,内循环走N次(N=外循环的次数) 其算法伪代码如下:

For each row r in R do # 扫􏰀R表 
  For each row s in S do # 扫􏰀S表
    If r and s satisfy the join condition # 如果r和s满足join条件 
      Then output the tuple # 那就输出结果集
  1. R 表,该表只扫描了一次;
  2. S 表,该表扫面了 count(R) 次;
  3. 该方法相当于是一个笛卡尔积,实际上数据库不会使用该算法;

index nested loop join

index nested_loop join算法是将外表扫􏰀一次,内表不会直接去扫描,而是查找内表相应的索引的方式,和外表的记录进行匹配

For each row r in R do # 扫􏰀R表
  lookup s in S index # 查询S表的索引(固定3~4次IO,B+树高度)
    if found s == r # 如果 r匹配了索引s Then output the tuple # 输出结果集
  1. S表上有索引
  2. 扫描R表,将R表的记录和S表中的索引进行匹配
  3. R表上可以没有索引
  4. 优化器倾向使用记录数少的表作为外表(又称驱动表)

    如果数据量大,index nested loop join的成本也是高的,尤其是在二级索引的情况下,需要大量的回表操作

block nested loop join

block nested loop join将外表中的需要join匹配的列(不是完整的记录)暂时保存在一块内存(join buffer)中,让后将这块内存中的数据和内表进行匹配(内表只扫描一次) block nested loop join 可被用于联接的是ALL,index,range的类型

For each tuple r in R do
  store used columns as p from R in join buffer # 将部分或者全部R的记录保存到 join buffer中,记为p 
    For each tuple s in S do
    If p and s satisfy the join condition # p 与 s满足join条件 
      Then output the tuple # 输出为结果集

block nested loop join 与 simple nested loop join 相比,多了一个 join buffer

mysql> show variables like "%join%buffer%";
+------------------+-----------+
| Variable_name    | Value     |
+------------------+-----------+
| join_buffer_size | 134217728 |  -- 128M,默认是256K
+------------------+-----------+
1 rows in set (0.05 sec)

mysql> 

join buffer用的不是Buffer Pool中的内存,而是线程级别的内存。 可以通过explain查看执行计划,并通过 join条件字段 的大小,来预估 join_buffer_size 的大小。

注意:
增大join_buffer_size 进行优化的前提是 没有使用index ,如果使用了index,根本不会使用block nested join算法

Contents
  1. 1. JOIN 语法
  2. 2. JOIN算法
    1. 2.1. simple nested loop join
    2. 2.2. index nested loop join
    3. 2.3. block nested loop join